In this paper we derive equations describing the performance of
various adaptive echo canceller configurations operating in a linear, 
time-invariant environment. We relate the parameters in these equa- 
tions to measurable environmental factors, discuss their effect on per- 
formance, and verify the results empirically. 

In general, the performance of an echo canceller cannot be exactly 
predicted for speech inputs. Therefore, the derived equations assume 
a stationary constant power random input. However, it is shown that 
the results obtained in this manner give useful estimates of the per- 
formance to be expected with speech inputs. The similarities and 
differences of the results for a constant power random input and
speech input are discussed in detail. 

I. INTRODUCTION 

The echo problem in the telephone network is caused by the interaction
of the following three factors: (i) The impedance mismatches that exist
at hybrid junctions cause reflections of incident electrical waves.
(ii) The existence of a bi-directional transmission medium permits the
reflected signal to reach the talker as echo. (iii) Time delay due to
the finite propagation time of a signal makes the echo annoying.
Historically the problem has been alleviated by increasing trunk loss,
balancing hybrids, applying four-wire circuits where practical, and
providing echo suppressors.¹

Echo suppressors are used when the echo delay exceeds about 45 ms.
An echo suppressor is a voice-operated device which switches a large
loss in the echo path, as shown in Fig. 1a. This loss blocks the echo
effectively but also tends to block speech from the near-end customer




THE BELL SYSTEM TECHNICAL JOURNAL, MARCH 1971 



Fig. 1. Block diagrams showing how (a) an echo suppressor is applied
in a telephone connection and (b) an echo canceller could be applied
in a telephone connection.



when he wishes to interrupt the far-end customer. This situation is 
known as double-talking. During double-talking it is necessary to
restore the connection to full duplex. Some speech mutilation (called
chopping) and echo occur during these double-talking periods. It
has been shown that these degradations become increasingly disturbing
as the echo delay increases.²,³

The performance of echo suppressors on synchronous satellite circuits
is less than satisfying due to the very long delay of such circuits.⁴
A new approach to the echo problem, called adaptive echo
cancellation,⁵⁻⁷ has been suggested as a possible alternative. In an
echo canceller an approximation of the echo signal is automatically
constructed and subtracted from the actual echo with no impairment to
duplex operation.

In this paper, our aim is to analyze the operation of an adaptive
echo canceller in a linear time-invariant environment. Our hope is to
give the reader insight into the parameters which affect performance
and their interrelatedness. Our method of presentation is as follows.
In Section II we discuss the environmental factors affecting
performance and we derive equations describing the operation of various
echo canceller configurations in a linear time-invariant environment.
Since we were unable to characterize an echo canceller for speech
inputs, the derived equations assume a stationary random input.
However, in Section III we interpret the equations and show empirically
that the results obtained give useful estimates of the performance
to be expected with speech inputs.

II. MATHEMATICAL DESCRIPTION OF THE ENVIRONMENT AND THE ADAPTIVE 
ECHO CANCELLER 

The echo paths that we will consider are assumed to be linear, time- 
invariant channels, not necessarily band-limited and otherwise general. 
This is not to imply that all real echo paths can be so characterized. 
In fact, time-varying echo paths have been observed and others are 
suspected of being significantly nonlinear. These deviations from the 
conditions assumed above may result in serious performance limita- 
tions. References 8 and 9 describe the effect on performance when the 
environment is either nonlinear or time-variant. 

A digital echo canceller, having filters with bandwidth, B, determined
by the sampling interval, T, can be used with all echo paths so long as
the filter bandwidths are at least as wide as that of the input signal,
x(t), i.e., $T \le 1/(2B)$. The same can be said for the bandwidths of
the filters of an analog canceller. We will assume that these are
also bandlimited to B Hz.

Assuming x(t) and the echo signal, y(t), are bandlimited to B Hz,
we can equivalently represent them as sequences of the sampled values
at times t = nT where n = 0, 1, 2, ... . Similarly other signals
pertinent to the echo canceller are discrete or continuous and have the
independent variable nT or t, respectively. For the sake of brevity we
will adopt a common notation, letting $\xi$ denote t or nT. Also the
convolution operation will be denoted as

\[ \alpha(\xi) * \beta(\xi) \]

which corresponds to

\[ \int_{-\infty}^{\infty} \alpha(\tau)\,\beta(t - \tau)\,d\tau \]

in the analog case and to

\[ \sum_{k=-\infty}^{\infty} \alpha(kT)\,\beta(nT - kT) \]

in the digital case. Where other differences occur the specific variable
will be used.

Whether the echo canceller is digital or analog, it would be inserted 
into the connection as shown in Fig. lb. The customer on the left 
(far end) is being protected from echo by the canceller shown. The 
customer on the right (near end) is being protected by a similar echo 
canceller on the far end of the connection. 

Three factors that affect the performance of an echo canceller are 
the types of signals used, echo paths encountered, and echo canceller 
configuration. We will consider these three points separately. There 
are three different signals present in the echo canceller environment: 

(i) The speech of the far-end customer, called the input signal x(t).
(ii) The speech of the near-end customer. When the near-end customer
and far-end customer speak simultaneously, we have double-talking.
This constitutes an interference to the echo canceller.
(iii) Interfering circuit noise which is inherent to the echo path.

The echo canceller must perform satisfactorily when these signals are
present in all possible combinations. Circuit noise, denoted as p(t), is
assumed to be a zero mean random process with variance $\sigma_p^2$
bandlimited to B Hz.

A block diagram of the canceller circuit used is shown in Fig. 2. 
The basic components of this canceller are: 

(i) A set of M filters having orthonormal impulse responses which
are the first M members of a complete basis set.
(ii) A control network which automatically weights and sums the
outputs of the M filters to generate an approximation of the echo.
(iii) Devices to couple the canceller to the telephone plant. The A-D
and D-A converters are required for an analog canceller operating
in a digital plant or vice versa.

Fig. 2. Block diagram of the structure of an echo canceller.

The set of M filters have impulse responses
$\lambda_1(\xi), \lambda_2(\xi), \ldots, \lambda_M(\xi)$. The output of
each filter, denoted as $w_m(\xi)$ and given by

\[ w_m(\xi) = \lambda_m(\xi) * x(\xi), \tag{1} \]

is weighted by the value of the tap gain $g_m(\xi)$. At $\xi = 0$ the
tap gain is set to some initial value (usually assumed to be zero). The
sum of these weighted outputs is the approximation of the echo and is
denoted as $\hat{y}(\xi)$. Thus we have

\[ \hat{y}(\xi) = \sum_{m=1}^{M} g_m(\xi)\, w_m(\xi) \tag{2} \]

which is subtracted from $y(\xi)$ to give the cancelled echo denoted as
$e(\xi)$. The cancelled echo plus noise $p(\xi)$ is operated on by a
function F and then multiplied by a positive factor K. F may be any odd
nondecreasing function. We will consider the two cases:

(i) $F[\cdot] = [\cdot]$ and

(ii) $F[\cdot] = \operatorname{sgn}[\cdot] =
\begin{cases} +1 & \text{if } [\cdot] \ge 0 \\ -1 & \text{if } [\cdot] < 0. \end{cases}$

The resulting signal is multiplied by $w_m(\xi)$ and integrated to yield
the value of each $g_m(\xi)$.
The analog network is governed by the set of differential equations

\[ \frac{d}{dt}\, g_m(t) = K F[e(t) + p(t)]\, w_m(t), \qquad m = 1, 2, \ldots, M. \tag{3} \]

The digital network is governed by the set of difference equations

\[ g_m(nT) = g_m(nT - T) + KT\, F[e(nT - T) + p(nT - T)]\, w_m(nT - T), \qquad m = 1, 2, \ldots, M. \tag{4} \]
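The difference equations (4) can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not the
simulation used in this paper: the filter set is taken to be a tapped
delay line (so each filter output is a delayed copy of the input), the
echo path `echo_path` is a short hypothetical impulse response, and the
names `adapt`, `sgn` and `kt` (standing for the product KT) are ours.

```python
import random

def sgn(v):
    """Hard limiter; the value at exactly zero is an assumed convention."""
    return 1.0 if v >= 0 else -1.0

def adapt(x, echo_path, kt, hard_limit=False, noise_std=0.0, seed=0):
    """Run the tap-gain recursion of equation (4) over the samples x.

    Assumes a tapped-delay-line filter set, so the m-th filter output is
    the input delayed by m samples. Returns the final tap gains.
    """
    rng = random.Random(seed)
    m_taps = len(echo_path)
    g = [0.0] * m_taps              # tap gains, initially zero
    w = [0.0] * m_taps              # filter outputs (delay-line contents)
    for sample in x:
        w = [sample] + w[:-1]       # shift the newest input sample in
        y = sum(h * wm for h, wm in zip(echo_path, w))     # echo
        y_hat = sum(gm * wm for gm, wm in zip(g, w))       # echo replica
        err = (y - y_hat) + rng.gauss(0.0, noise_std)      # e + p
        f = sgn(err) if hard_limit else err                # F[e + p]
        g = [gm + kt * f * wm for gm, wm in zip(g, w)]     # tap update
    return g

source = random.Random(1)
x = [source.gauss(0.0, 1.0) for _ in range(5000)]
echo_path = [0.5, -0.3, 0.2, 0.1]   # hypothetical echo-path impulse response
taps_linear = adapt(x, echo_path, kt=0.01)
taps_limited = adapt(x, echo_path, kt=0.001, hard_limit=True)
```

With a noise-free white input both choices of F drive the tap gains
toward the echo-path coefficients; the hard-limited loop settles into a
small residual jitter whose size depends on `kt`.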
We can write the error, $e(\xi)$, as

\[ e(\xi) = y(\xi) - \hat{y}(\xi) = \sum_{m=1}^{M} [c_m - g_m(\xi)]\, w_m(\xi) + q(\xi), \tag{5} \]

where

\[ q(\xi) = \sum_{m=M+1}^{\infty} c_m\, \lambda_m(\xi) * x(\xi). \tag{6} \]

The coefficients $c_m$, m = 1, 2, ..., are the generalized Fourier
coefficients of the echo path transfer function H(f) over the bandwidth
$|f| \le B$ relative to the complete basis set. They are given by the
equation

\[ c_m = \int_{-B}^{B} H(f)\, \overline{\Lambda_m(f)}\, df, \qquad m = 1, 2, \ldots \tag{7} \]

where $\Lambda_m(f)$ is the Fourier transform of $\lambda_m(\xi)$. The
term $q(\xi)$ is called the uncancellable part of $e(\xi)$.

Echo suppression achieved $\xi$ seconds after the start of canceller
operation is defined as

\[ S(\xi) = -10 \log \frac{E[e^2(\xi)]}{E[y^2(\xi)]}. \tag{8} \]

† The overbar denotes complex conjugation.
* E denotes ensemble average.

Maximum achievable suppression is denoted as $S_{\max}$ and equals
$\lim_{\xi \to \infty} [S(\xi)]$. We define the average settling time
$t_s$ to be the time in seconds required for the suppression $S(\xi)$
to reach 98 percent of $S_{\max}$ in decibels.
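The settling-time definition can be applied directly to a measured or
simulated suppression trace. The sketch below is a minimal illustration;
the function name `settling_time`, the sampling step `dt` and the
example trace are assumptions made here, not quantities from the paper.

```python
def settling_time(suppression_db, dt, s_max_db):
    """First time at which S reaches 98 percent of S_max (both in dB).

    suppression_db holds S sampled every dt seconds, starting at time 0.
    Returns None if the threshold is never reached.
    """
    threshold = 0.98 * s_max_db
    for n, s in enumerate(suppression_db):
        # For a net gain (negative S_max) the trace approaches from above.
        reached = s >= threshold if s_max_db >= 0 else s <= threshold
        if reached:
            return n * dt
    return None

# Hypothetical trace rising geometrically toward 30 dB, sampled every 1 ms.
trace = [30.0 * (1.0 - 0.9 ** n) for n in range(200)]
t_s = settling_time(trace, dt=0.001, s_max_db=30.0)
```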

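We will now derive equations for maximum achievable suppression
and average settling time for the two cases F[·] = [·] and F[·] =
sgn[·]. To facilitate the derivations we define the column matrices

\[ W(\xi) = \begin{bmatrix} w_1(\xi) \\ \vdots \\ w_M(\xi) \end{bmatrix} \]

and

\[ R(\xi) = C - G(\xi) = \begin{bmatrix} c_1 \\ \vdots \\ c_M \end{bmatrix} - \begin{bmatrix} g_1(\xi) \\ \vdots \\ g_M(\xi) \end{bmatrix}. \]

Using these matrices, we may write $e(\xi)$ as†

\[ e(\xi) = R'(\xi) \cdot W(\xi) + q(\xi) \tag{9} \]

and $y(\xi)$ as

\[ y(\xi) = C' \cdot W(\xi) + q(\xi). \tag{10} \]

In the derivations which follow we will assume: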



(i) The input signal, x(t), is a stationary random process having a
rectangular power density spectrum

\[ P_x(f) = \begin{cases} \sigma_x^2 / (2B), & |f| \le B \\ 0, & \text{otherwise}; \end{cases} \]

(ii) The circuit noise, p(t), is a stationary zero mean random process
bandlimited to B Hz and independent of x(t);
(iii) For the case F[·] = sgn[·] we will further assume that x(t) and
p(t) are gaussian with zero mean;
(iv) $R(\xi)$ is independent of both $x(\xi)$ and $p(\xi)$.

With regard to the last assumption, it is clear that, since $R(\xi)$ is
a function of $x(\xi)$ and $p(\xi)$, it cannot truly be independent of
$x(\xi)$ and $p(\xi)$. For reasonable values of the feedback factor K,
the rate of change of $R(\xi)$ will be much slower than that of
$x(\xi)$ or $p(\xi)$, so that the assumption is justified for a wide
range of operating conditions.

† An apostrophe denotes matrix transpose and a dot denotes scalar
multiplication.

Substituting equations (9) and (10) into equation (8), it can be
shown that

\[ S(\xi) = -10 \log \left[ \frac{\int_{-B}^{B} |H(f)|^2\, df - C' \cdot C + E[R'(\xi) \cdot R(\xi)]}{\int_{-B}^{B} |H(f)|^2\, df} \right]. \tag{11} \]

The integral in the denominator of equation (11) is defined as the
echo path energy which will be denoted as $\psi$. We see from equation
(11) that $S(\xi)$ is maximized for a given echo path if
$E[R'(\xi) \cdot R(\xi)] = 0$. The maximum value is

\[ S(\xi) = -10 \log \left[ \frac{\psi - C' \cdot C}{\psi} \right]. \tag{12} \]

Actually this suppression may not be achieved because
$E[R'(\xi) \cdot R(\xi)]$ may not vanish. However, equation (12) gives
a theoretical limit on suppression as defined, this limit being a
function of the basis set used and the filter set truncation. We will
define an incompleteness (truncation) factor I as

\[ I \triangleq \frac{\psi - C' \cdot C}{\psi}. \tag{13} \]

Note that I is a nonseparable function of the environment and the
echo canceller. That is to say, to calculate the incompleteness factor
one must know the echo-path transfer function over the bandwidth of
the input signal, the filter set used in the echo canceller and the
number of taps employed in the canceller.
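As a small numerical illustration of equations (12) and (13), the
sketch below computes I and the corresponding suppression limit from a
set of coefficients. The coefficient values are hypothetical, and the
echo-path energy is obtained here by assuming that the listed
coefficients account for all of it (as they would for a complete
orthonormal basis); the function names are ours.

```python
import math

def incompleteness_factor(psi, coeffs):
    """I = (psi - C'C) / psi for echo-path energy psi and taps c_1..c_M."""
    captured = sum(c * c for c in coeffs)
    return (psi - captured) / psi

def suppression_limit_db(i_factor):
    """Theoretical limit of equation (12): -10 log10(I)."""
    return -10.0 * math.log10(i_factor)

# Hypothetical echo path whose energy is spread over six coefficients;
# the canceller keeps only the first four (M = 4).
all_coeffs = [0.8, 0.5, 0.3, 0.1, 0.05, 0.02]
psi = sum(c * c for c in all_coeffs)
i_factor = incompleteness_factor(psi, all_coeffs[:4])
limit_db = suppression_limit_db(i_factor)
```

Adding taps moves energy from the uncancellable part into C'·C, which
lowers I and raises the limit.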

To find maximum achievable suppression and average settling time,
we must evaluate the term $E[R'(\xi) \cdot R(\xi)]$ for the digital and
analog case under the two conditions of the function F.

2.1 Evaluation of E[R'(t)·R(t)] for the Analog Case to Yield S_max and t_s

Using the definition of $R(\xi)$ and equation (3), we may write

\[ \frac{d}{dt} [R'(t) \cdot R(t)] = -2K F[R'(t) \cdot W(t) + q(t) + p(t)]\, R'(t) \cdot W(t). \tag{14} \]

2.1.1 The Case F[·] = [·]

For this case, we can write equation (14) as

\[ \frac{d}{dt} [R'(t) \cdot R(t)] = -2K \left\{ [R'(t) \cdot W(t)]^2 + [R'(t) \cdot W(t)][q(t) + p(t)] \right\}. \tag{15} \]



Solving equation (15) for the expectation of R'(t)·R(t) gives

\[ E[R'(t) \cdot R(t)] = R'(0) \cdot R(0) \exp\left( -\frac{K \sigma_x^2 t}{B} \right). \tag{16} \]

Substituting equation (16) into equation (11) yields

\[ S(t) = -10 \log \left[ I + \frac{R'(0) \cdot R(0)}{\psi} \exp\left( -\frac{K \sigma_x^2 t}{B} \right) \right]. \tag{17} \]

As $t \to \infty$, we have

\[ S_{\max} = -10 \log I. \tag{18} \]

For the given assumptions, the maximum achievable suppression is
not a function of circuit noise and is limited only by the
incompleteness of the filter set. Of course, strictly speaking,
$S_{\max}$ would be less than this limit by an amount depending on the
correlation existing between R(t) and p(t).
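Defining the term

\[ s(t_s) = \log^{-1} \left( -\frac{0.98\, S_{\max}}{10} \right), \tag{19} \]

the antilog of the suppression at the settling time $t_s$, substituting
this into equation (17) and rearranging we get

\[ t_s = \frac{B \log \left[ \dfrac{R'(0) \cdot R(0)}{\psi\, [s(t_s) - I]} \right]}{0.434\, K \sigma_x^2}. \tag{20} \]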
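Equations (17) through (20) are easy to evaluate numerically. The
sketch below is a hedged illustration: it reads equation (17) as
S(t) = -10 log[I + (R'(0)·R(0)/ψ) exp(-Kσ_x²t/B)], uses log10(e) in
place of the rounded constant 0.434, and the parameter values (a loop
with Kσ_x²/B = 10 per second, I = 0.001, zero initial tap gains) are
hypothetical.

```python
import math

LOG10_E = math.log10(math.e)   # the constant rounded to 0.434 in eq. (20)

def suppression_db(t, k, sigma_x2, bandwidth, i_factor, r0_over_psi):
    """Equation (17): suppression in dB after t seconds (analog, F linear)."""
    decay = math.exp(-k * sigma_x2 * t / bandwidth)
    return -10.0 * math.log10(i_factor + r0_over_psi * decay)

def settling_time_analog(k, sigma_x2, bandwidth, i_factor, r0_over_psi):
    """Equation (20): time for S(t) to reach 98 percent of S_max in dB."""
    s_max_db = -10.0 * math.log10(i_factor)            # equation (18)
    s_ts = 10.0 ** (-0.98 * s_max_db / 10.0)           # equation (19)
    ratio = r0_over_psi / (s_ts - i_factor)
    return bandwidth * math.log10(ratio) / (LOG10_E * k * sigma_x2)

# Hypothetical parameters; r0_over_psi = 1 - I corresponds to R(0) = C.
args = dict(k=40000.0, sigma_x2=1.0, bandwidth=4000.0,
            i_factor=0.001, r0_over_psi=0.999)
t_s = settling_time_analog(**args)
s_at_ts = suppression_db(t_s, **args)
```

Evaluating `suppression_db` at the returned time gives 98 percent of
the 30-dB limit, which is a quick consistency check between (17) and
(20).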

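2.1.2 The Case F[·] = sgn[·] (hard limiter)

As above, $S_{\max}$ and $t_s$ may be derived yielding the following two
equations:

\[ S_{\max} = -10 \log I \tag{21} \]

and

\[ t_s = \frac{B}{K \sigma_x^2} \left( \frac{\pi}{2} \right)^{1/2} \left\{ 2(\gamma - \eta) + 2.3\, \mu \log \left[ \frac{\gamma - \mu}{\gamma + \mu} \cdot \frac{\eta + \mu}{\eta - \mu} \right] \right\}, \tag{22} \]

where

\[ \eta = \left[ \sigma_p^2 + \frac{\sigma_x^2\, \psi\, s(t_s)}{2B} \right]^{1/2}, \]

\[ \mu = \left[ \sigma_p^2 + \frac{\sigma_x^2 (\psi - C' \cdot C)}{2B} \right]^{1/2}, \]

\[ \gamma = \left[ \sigma_p^2 + \frac{\sigma_x^2 [\psi - C' \cdot C + R'(0) \cdot R(0)]}{2B} \right]^{1/2}. \]

If the circuit noise p(t) can be neglected and if $I \approx 0$ then
equation (22) can be written as

\[ t = \frac{1}{K \sigma_x} [\pi B\, R'(0) \cdot R(0)]^{1/2}\, [1 - \sqrt{s(t)}] \tag{23} \]

where s(t) is the antilog of the suppression at time t.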

2.2 Evaluation of E[R'(nT)·R(nT)] for the Digital Case to Yield S_max and t_s

Using the definition of $R(\xi)$ and equation (4), we can write

\[ \begin{aligned}
R'(nT) \cdot R(nT) &- R'(nT - T) \cdot R(nT - T) \\
&= -2KT\, F[R'(nT - T) \cdot W(nT - T) + q(nT - T) + p(nT - T)]\, R'(nT - T) \cdot W(nT - T) \\
&\quad + K^2 T^2 F^2[R'(nT - T) \cdot W(nT - T) + q(nT - T) + p(nT - T)]\, W'(nT - T) \cdot W(nT - T).
\end{aligned} \tag{24} \]

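2.2.1 The Case F[·] = [·]

Solving equation (24) for the expectation of R'(nT)·R(nT), it may be
shown that S(nT) is given by

\[ S(nT) = -10 \log \left\{ I \left[ 1 + M K^2 T^2 \sigma_x^4\, \frac{\alpha^n - 1}{\alpha - 1} \right] + \frac{R'(0) \cdot R(0)}{\psi}\, \alpha^n + \frac{M K^2 T^2 \sigma_x^2 \sigma_p^2}{\psi} \cdot \frac{\alpha^n - 1}{\alpha - 1} \right\} \tag{25} \]

for n = 0, 1, 2, ... and where

\[ \alpha = 1 - KT \sigma_x^2 [2 - KT(M + 2) \sigma_x^2], \qquad 0 < \alpha < 1; \tag{26} \]

and

\[ \beta = M K^2 T^2 \sigma_x^4 (\psi - C' \cdot C) + M K^2 T^2 \sigma_x^2 \sigma_p^2. \tag{27} \]

Note that the limits on $\alpha$ are necessary to yield a convergent
system. We define the signal to noise ratio $\nu$ at the output of the
echo path as

\[ \nu = \frac{\sigma_x^2\, \psi}{\sigma_p^2}. \]

The maximum suppression is given from the limiting value of S(nT) as
$n \to \infty$ as

\[ S_{\max} = -10 \log \left\{ I \left[ 1 - \frac{M (KT \sigma_x^2)^2}{\alpha - 1} \right] - \frac{M (KT \sigma_x^2)^2}{\nu (\alpha - 1)} \right\}, \tag{28} \]

and $t_s$ is given by

\[ t_s = \frac{T \log \left[ \dfrac{s(t_s) - s_{\max}}{\dfrac{R'(0) \cdot R(0)}{\psi} + \dfrac{I M (KT \sigma_x^2)^2}{\alpha - 1} + \dfrac{M (KT \sigma_x^2)^2}{\nu (\alpha - 1)}} \right]}{\log [\alpha]}. \tag{29} \]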
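The digital-case results can be evaluated with a short script. The
sketch below is our reading of equations (25), (28) and (29), written
in terms of the single loop parameter u = KTσ_x² and a signal-to-noise
ratio ν taken to be the echo power divided by the noise power at the
echo-path output; the function names and the assumption of zero initial
tap gains (so that R'(0)·R(0)/ψ = 1 - I) are ours.

```python
import math

def alpha(u, m_taps):
    """alpha of equation (25); u stands for K*T*sigma_x^2."""
    return 1.0 - u * (2.0 - u * (m_taps + 2))

def noise_term(u, m_taps, i_factor, snr):
    """M (K T sigma_x^2)^2 (I + 1/nu); snr is a linear power ratio."""
    return m_taps * u * u * (i_factor + 1.0 / snr)

def max_suppression_db(u, m_taps, i_factor, snr):
    """Equation (28): limiting suppression in dB."""
    offset = noise_term(u, m_taps, i_factor, snr) / (alpha(u, m_taps) - 1.0)
    return -10.0 * math.log10(i_factor - offset)

def settling_iterations(u, m_taps, i_factor, snr, r0_over_psi):
    """Equation (29) divided by T: iterations to reach 98 percent of S_max."""
    a = alpha(u, m_taps)
    offset = noise_term(u, m_taps, i_factor, snr) / (a - 1.0)
    s_max = i_factor - offset          # antilog of -S_max / 10
    s_ts = s_max ** 0.98               # antilog of -0.98 S_max / 10
    return math.log((s_ts - s_max) / (r0_over_psi + offset)) / math.log(a)

# KT*sigma_x^2 = 1e-3, M = 100, I = 0.01, zero initial tap gains.
clean = settling_iterations(1e-3, 100, 0.01, math.inf, r0_over_psi=0.99)
noisy = settling_iterations(1e-3, 100, 0.01, 0.01, r0_over_psi=0.99)
```

With these values the sketch gives roughly 3.6 × 10³ iterations for
S/N = ∞ and about 1.7 × 10³ for S/N = -20 dB, with limiting
suppressions near 20 dB and -7 dB, which is in line with the example
quoted for Figs. 5 and 9 in Section 3.1.1.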

2.2.2 The Case F[·] = sgn[·]

With the hard limiter we can write equation (24) as

\[ \begin{aligned}
R'(nT) \cdot R(nT) &- R'(nT - T) \cdot R(nT - T) \\
&= -2KT \operatorname{sgn}[R'(nT - T) \cdot W(nT - T) + q(nT - T) + p(nT - T)]\, R'(nT - T) \cdot W(nT - T) \\
&\quad + K^2 T^2\, W'(nT - T) \cdot W(nT - T).
\end{aligned} \tag{30} \]

Using the same assumptions and analysis technique as used for the
analog case, we can derive the average value of equation (30)
obtaining

\[ \begin{aligned}
E[R'(nT) \cdot R(nT)] &- E[R'(nT - T) \cdot R(nT - T)] \\
&= -2KT \sigma_x \left( \frac{2}{\pi} \right)^{1/2} \frac{E[R'(nT - T) \cdot R(nT - T)]}{\left\{ E[R'(nT - T) \cdot R(nT - T)] + \psi (I + 1/\nu) \right\}^{1/2}} + M K^2 T^2 \sigma_x^2.
\end{aligned} \tag{31} \]

Rather than attempt a solution of this nonlinear difference equation,
we will find only the limiting value of E[R'(nT)·R(nT)]. For
E[R'(nT)·R(nT)] to converge, we require that the right hand side of
equation (31) be nonpositive. Using this inequality and solving for
E[R'(nT)·R(nT)] gives

\[ \frac{E[R'(nT) \cdot R(nT)]}{\psi} \ge \frac{\pi M^2 K^2 T^2 \sigma_x^2}{16 \psi} \left\{ 1 + \left[ 1 + \frac{32 \psi (I + 1/\nu)}{\pi M^2 K^2 T^2 \sigma_x^2} \right]^{1/2} \right\}. \tag{32} \]

An explicit expression for the average settling time $t_s$ for this case
is not available. As an alternative, the analog equation for settling
time $t_s$, equation (22), with K replaced by KT may be used to predict
settling time for this digital case. When using equation (22), we
should use equation (32) to calculate $s(t_s)$.
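Because equation (31) has no closed-form solution, its limiting value
can also be checked by simply iterating it. The sketch below assumes
the averaged recursion has the form
ρ(n) = ρ(n-1) - 2KTσ_x(2/π)^{1/2} ρ(n-1)/[ρ(n-1) + ψ(I + 1/ν)]^{1/2}
+ M(KTσ_x)², with ρ standing for E[R'·R]; this is our reading of
equation (31), and the parameter values and function names are
hypothetical.

```python
import math

def limiting_residual(kt, sigma_x, m_taps, psi, i_factor, snr):
    """Value of E[R'R] at which the averaged recursion stops decreasing."""
    d2 = math.pi * (m_taps * kt * sigma_x) ** 2 / 8.0
    a = psi * (i_factor + 1.0 / snr)
    return 0.5 * d2 * (1.0 + math.sqrt(1.0 + 4.0 * a / d2))

def iterate_residual(rho0, steps, kt, sigma_x, m_taps, psi, i_factor, snr):
    """Iterate the assumed averaged recursion for rho = E[R'R]."""
    a = psi * (i_factor + 1.0 / snr)
    gain = 2.0 * kt * sigma_x * math.sqrt(2.0 / math.pi)
    push = m_taps * (kt * sigma_x) ** 2
    rho = rho0
    for _ in range(steps):
        rho += -gain * rho / math.sqrt(rho + a) + push
    return rho

params = dict(kt=1e-3, sigma_x=1.0, m_taps=10, psi=1.0,
              i_factor=0.01, snr=math.inf)
rho_limit = limiting_residual(**params)
rho_run = iterate_residual(0.99, 20000, **params)
s_max_db = -10.0 * math.log10(params["i_factor"] + rho_limit / params["psi"])
```

The iterated value settles on the closed-form limit, and inserting that
limit into equation (11) gives the reduced maximum suppression that the
hard limiter imposes in the digital case.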

III. EMPIRICAL RESULTS AND INTERPRETATIONS OF THE THEORETICAL 
RESULTS 

In this section we will compare the theoretical results with empirical
findings. We will then discuss the performance predicted by the
equations as several of the parameters are varied. Finally we will
demonstrate that although the equations were derived for a noise
input, they yield useful information about the operation of the
canceller for speech inputs provided that the echo path is linear and
time-invariant. The empirical results tabulated below pertain only to
digital implementations, since an analog system was not available for
testing.

The echo canceller shown in Fig. 2 was simulated on a digital 
computer. Also an echo path, chosen to have characteristics which 
are similar to those of real echo paths which have been observed, was 
simulated on the computer. Experiments have also been performed 
incorporating various analog echo paths with the results in general 
agreement with predicted performance. For the sake of brevity the 
latter results are omitted. 

The measure of echo canceller suppression which we use to monitor
canceller performance is defined by the equation

\[ S(\xi) = -10 \log \left[ \frac{\int_{-B}^{B} \left| H(f) - \sum_{m=1}^{M} g_m(\xi)\, \Lambda_m(f) \right|^2 df}{\int_{-B}^{B} |H(f)|^2\, df} \right]. \tag{33} \]

Equation (33) yields a measure of the goodness of fit across the entire
band at time $\xi$. It is not necessarily equivalent to the subjective echo
reduction which a listener would perceive. In fact subjective testing on 
a very limited basis indicates that perceived loss may be greater than 
is indicated by this measure. It is important that this subjective factor 
be considered when comparing the predicted suppression of an echo 
canceller with what is considered to be necessary for adequate echo 
reduction. It is clear that equation (33) would give values of sup- 
pression identical to the values given by the previously derived equa- 
tions if the input to the canceller is noise with a rectangular power 
density spectrum. 

3.1 Random Noise Input Results 

3.1.1 F[·] = [·]

In Fig. 3 we plot suppression for a typical simulation. The choice of
parameters for the simulation is listed on the figure. The crosses
indicate the results of the computer simulation [equation (33)], which
are compared against the values of settling time, $t_s$, and maximum
suppression, $S_{\max}$, as predicted by equations (29) and (28)
respectively. We also compare the results of the simulation against the
suppression as a function of time predicted by equation (25). We see
from the figure that a high value of suppression is obtainable.

In Fig. 4, we allowed the echo canceller to reach its maximum
suppression with no circuit noise present. Then we introduced a high
noise level, S/N = -18 dB, for approximately 1.5 seconds and then
removed it, simulating double-talking. It is clear from the figure that
the results are in very good agreement with the equations.

From these results and numerous others using different echo paths 
and basis sets we draw the following conclusions: 

The assumption of the independence of $R(\xi)$ from $x(\xi)$ and
$p(\xi)$ is quite reasonable for suppressions of up to 40 dB, S/N
ratios as low as -20 dB, and settling times of 0.3 second and greater.
Therefore for random noise inputs we conclude that the derived
equations are very accurate predictors of the performance of an echo
canceller.

We now focus our attention on the nature of the equations, and discuss 
the effects of various environmental factors upon them. 

In Figs. 5 through 7, we plot maximum suppression (28) versus
$KT\sigma_x^2$ for 100 taps and the incompleteness factors I = 0.01,
0.001, 0.0001. In all three figures $S_{\max}$ decreases as the S/N
decreases. Note that for small values of $KT\sigma_x^2$, as the
incompleteness factor I decreases, the increase in suppression becomes
larger for equal increments in S/N. That is, circuit noise (and
double-talking) becomes more troublesome as the echo canceller is
designed to give higher and higher values of $S_{\max}$.

Fig. 4. The same as Fig. 3 except for a different choice of parameters.

Note that for fixed S/N, K, and T, an increase in the input signal
power reduces the maximum achievable suppression. On the other
hand, for fixed $\sigma_p^2$ and $KT\sigma_x^2 < 10^{-3}$, a change in
signal has no appreciable effect upon maximum suppression. However from
equation (25), we see that convergence is assured if and only if
$0 < \alpha < 1$. This, in turn, implies that
$KT\sigma_x^2 < 2/(M + 2)$. Therefore, we cannot make $KT\sigma_x^2$
arbitrarily large.
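The convergence bound can be checked numerically. The sketch below
assumes the expression α = 1 - u[2 - (M + 2)u] with u = KTσ_x², which
is our reading of the definition accompanying equation (25); the
function names are ours.

```python
def alpha(u, m_taps):
    """alpha of equation (25); u stands for K*T*sigma_x^2."""
    return 1.0 - u * (2.0 - u * (m_taps + 2))

def max_loop_gain(m_taps):
    """Upper bound on K*T*sigma_x^2 for convergence: 2 / (M + 2)."""
    return 2.0 / (m_taps + 2)

m_taps = 100
bound = max_loop_gain(m_taps)            # about 0.0196 for 100 taps
inside = alpha(0.5 * bound, m_taps)      # below the bound: 0 < alpha < 1
outside = alpha(1.5 * bound, m_taps)     # above the bound: alpha > 1
```

For M = 100 the bound is about 0.02, so the range of loop gains plotted
in Figs. 5 through 7 (up to 10⁻²) lies inside the convergent region.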

Figures 5 through 7 were calculated for M = 100. In order to
investigate the sensitivity of $S_{\max}$ to M we have plotted
$S_{\max}$ versus M for $KT\sigma_x^2 = 0.0001$ and I = 0.001 and 0.01
in Fig. 8. Observe that $S_{\max}$ is a weak function of M. Thus, Figs.
5 through 7 can be used to predict $S_{\max}$ for given I and
$KT\sigma_x^2$ with little regard for M.

In Figs. 9 through 11, we plot settling time versus $KT\sigma_x^2$ for
various choices of I and S/N. Note that a decreasing S/N results in a
decreasing settling time. This could lead to the erroneous conclusion
that high noise levels help convergence. We find, however, that as the
noise level increases, $S_{\max}$ decreases. In some cases of very high
noise levels, the echo canceller could even provide a net gain.
Intuitively it is clear that, starting with zero suppression, it should
take less time to settle to the lower level of suppression. For
example, consider Fig. 9, with $KT\sigma_x^2 = 10^{-3}$. It shows that
for S/N = ∞ the settling time is approximately 3.6 × 10³ iterations
versus 1.6 × 10³ for S/N = -20 dB. However, Fig. 5 shows that
$S_{\max}$ is 19 dB and -7 dB respectively. This situation also
illustrates what may happen when a strong interference such as
double-talking occurs. The interference will cause divergence to a
reduced suppression and may even cause a net gain. We also conclude
that for fixed S/N and KT, the larger the input signal power, the
shorter the settling time will be. However, for the reasons explained
previously, the signal power cannot be made arbitrarily large. For
constant noise power and fixed KT, it can be seen that settling time is
decreased and the rate of increase of suppression is made larger with
increased input signal power.

Fig. 5. Theoretically maximum suppression versus $KT\sigma_x^2$ for an
incompleteness factor of 0.01 and various echo-to-noise ratios.

Fig. 6. The same as Fig. 5 but with an incompleteness factor of 0.001.

Fig. 7. The same as Figs. 5 and 6 but with an incompleteness factor of
0.0001. [Plot of equation (28) for F[·] = [·], I = 0.0001, M = 100;
curves for S/N = ∞, 0 dB, -10 dB and -20 dB; horizontal axis
$KT\sigma_x^2$ from 10⁻⁵ to 10⁻².]

Fig. 8. Theoretically maximum suppression versus number of taps for
various echo-to-noise ratios and incompleteness factors.

In Fig. 12 we plot settling time as a function of the number of filters 
M for several values of S/N and I. We see that settling time is rela- 
tively insensitive to M and that Figs. 9 through 11 may be used to 
estimate settling time irrespective of M. 

3.1.2 F[·] = sgn[·]

We now consider the echo canceller with a hard limiter in the feedback
loop. We cannot predict the exact temporal performance of this
canceller configuration because we have no solution to the governing
difference equation (31). However, we may estimate it by using the
solution of the analog differential equation and replacing K with KT.
Since this imposes no limit on maximum suppression, we must combine
this with the limiting value of $S_{\max}$ given by equation (32). This
technique yields a reasonable prediction of the operation. For this
case there are too many variables to present easily a set of curves
which describes the operation quantitatively. Therefore we will make
some qualitative observations which are generally true. We will use
Figs. 13 and 14 as typical examples but we emphasize that these curves
are only quantitatively valid for the particular choice of parameters
given. We observe that $S_{\max}$ and settling time are inversely
proportional to $\sigma_x$. For fixed $KT\sigma_x$, $S_{\max}$
decreases as S/N decreases. For constant signal to noise ratio,
$S_{\max}$ decreases with increasing $KT\sigma_x$. However, for
constant noise level, $S_{\max}$ is relatively insensitive to changes
in $KT\sigma_x$ over a wide range. Note too that for $KT\sigma_x$
sufficiently large the canceller may introduce a net gain. For fixed
S/N, settling time is a decreasing function of $KT\sigma_x$. For fixed
$KT\sigma_x$ a decrease in S/N produces an increase in settling time,
while for the case F[·] = [·] the opposite was true. We will now turn
our attention to the operation of the echo canceller with speech; we
will attempt to interpret the equations in this new light. We will also
attempt to show empirically that the results we obtain give useful
estimates of performance.

Fig. 9. Settling time versus $KT\sigma_x^2$ for an incompleteness
factor of 0.01 and various echo-to-noise ratios.

3.2 Operation With Speech

The fundamental differences between noise and speech are:

(i) The short-time (50 ms) average power of speech is erratic from
time interval to time interval whereas by comparison it is relatively
constant for the random noise.
(ii) The spectral density of speech is nonuniform, and depends on
which phoneme is spoken and who speaks it. In fact, the speech
power is usually concentrated in only a few narrow frequency
bands at a time. However, if enough time is allowed to elapse, it
is reasonable to assume that the speech power* will eventually
scan the entire available bandwidth.

At present, no adequate statistical description of a speech signal
accounting for the above properties is available. Using the long-time
(several seconds) estimate of average speech power, we have found
that the results derived in this paper for random noise may be used as
an estimate of the performance which can be expected with speech
inputs. However, the blind use of the equations may give erroneous
and optimistically deceiving estimates. We show below how the
equations should be used to predict the maximum achievable suppression
obtained from the echo canceller with a speech input, and discuss the
significance of the settling time estimates.

* Strictly speaking, this is also true for the random-noise case.
However, for noise, the power density spectrum may be considered
uniform for a shorter period of elapsed time.

Fig. 10. The same as Fig. 9 but with an incompleteness factor of 0.001.

Fig. 11. The same as Figs. 9 and 10 but with an incompleteness factor
of 0.0001.

Fig. 12. Settling time versus number of taps for incompleteness factors
of 0.01 and 0.001.

Fig. 13. Theoretically maximum suppression versus $KT\sigma_x$ for an
incompleteness factor of 0.01 and various echo-to-noise ratios. [Plot
of equation (32) for F[·] = sgn[·].]

Fig. 14. Settling time versus $KT\sigma_x$ for an incompleteness factor
of 0.01 and various echo-to-noise ratios.

Because of the variation in speech power level on a short-time (50 ms)
basis, we find that the rate of convergence of an echo canceller is
erratic. To illustrate this, assume for the moment that we have
available speech with a uniform spectral density but with short-time
power level variations. For such an input signal, we would find that
the operation would be as predicted by the equations with $\sigma_x^2$
(short-time power estimate) considered a function of time.
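A short-time power estimate of this kind can be computed by averaging
the squared samples over consecutive windows. The sketch below is one
simple way to do so; non-overlapping 50-ms windows, the function name
`short_time_power` and the 8-kHz example signal are assumptions made
for illustration only.

```python
def short_time_power(samples, sample_rate, window_s=0.05):
    """Average power over consecutive non-overlapping windows (50 ms default).

    A trailing partial window is discarded.
    """
    n = max(1, int(round(window_s * sample_rate)))
    powers = []
    for start in range(0, len(samples) - n + 1, n):
        block = samples[start:start + n]
        powers.append(sum(v * v for v in block) / n)
    return powers

# Hypothetical signal at 8 kHz: a loud 50-ms segment, then a quiet one.
rate = 8000
signal = [1.0] * 400 + [0.1] * 400
powers = short_time_power(signal, rate)
```

Feeding each successive value in place of a fixed input power makes the
predicted convergence rate vary from window to window, as described
above.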
The time variation of the speech spectrum causes individual tap
settings of the echo canceller to become correlated. This effect is not
taken into account in the equations. This correlation causes the
settling time to be longer than the equations predict. Because of the
variations in the spectral density of speech over time, we find that the
echo canceller converges on a frequency selective basis. Figure 15
shows the plot of suppression versus time for the echo canceller with a
random noise input and F[·] = sgn[·]. The suppression was measured
in 20 adjacent frequency bands approximately 200 Hz wide from 0 to
4000 Hz. The suppression in each band was computed by integrating
only over that band using equation (33). Two of these bands are
shown in Fig. 15. Note that each one converges at approximately the
same rate to a limiting value where it then begins to oscillate. Note
also that for each band the limiting suppression is reached very close
to the predicted settling time of 0.3 second. The other 18 bands
behaved similarly. Similar results were obtained with F[·] = [·].

Figure 16 demonstrates what happens when speech is used instead
of random noise. The designated value of settling time in this
experiment was $t_s$ = 1 second, using the long-time average (5 seconds)
speech power as $\sigma_x^2$. Note that the suppression increases at
different rates. For example, between t = 0.75 second and t = 1.25
seconds, the suppression in the 400- to 600-Hz band increases 9 dB
while the suppression in the other 2 bands increases 4 dB at most.
Between t = 1.25 seconds and t = 1.5 seconds, however, the rate of
convergence becomes most rapid in the 1000- to 1200-Hz band. This
frequency selective convergence is undoubtedly due to the variation in
the spectral distribution of speech power. One result of this is a
longer overall settling time based on our measure of suppression. The
experiment indicates that although the overall settling time may be
longer, the echo canceller converges in some frequency bands more
rapidly than average. The bands where this speedy convergence takes
place are those where the speech power is greatest at a given time.
Because of this, we believe that the perceived settling time would be
shorter than is indicated by our measure of suppression, equation
(33).* Over a long time (several seconds), this selective convergence
results in a fit almost equivalent to that of the random noise across
the bandwidth of the speech input. We find that the maximum suppression
$S_{\max}$ achieved with a speech input is very nearly equal to that
given by the equations when the long-time average speech power is used
for $\sigma_x^2$.

Fig. 15. Suppression as a function of time in the 400- to 500-Hz and
3200- to 3400-Hz frequency bands for a random noise input signal.

Fig. 16. Suppression as a function of time in the 400- to 600-Hz,
1000- to 1200-Hz and 3200- to 3400-Hz bands with a speech input signal.

* A need exists for subjective tests which relate suppression (Equation 44)
to perceived suppression.

Figure 17 shows some typical results which were obtained for a
simulation with a speech input. Curve (A) shows the results of
simulations where S(nT) is calculated every 50 ms using equation (33).
Curve (B) shows the performance as predicted using equation (25)
and a long-time average (5 seconds) speech power for $\sigma_x^2$. The
settling time was 2.5 seconds. Curve (C) shows the resulting prediction
when $\sigma_x^2$ is replaced by a function of time $\sigma_x^2(t)$,
which in this case is the successive short-time (50 ms) average speech
power. Note that Curves (A) and (C) are almost identical in shape.
However, (C) settles more rapidly to its final value, as expected. As
explained, the nonuniformity of the speech spectrum causes the longer
settling time of (A). Had the simulation been plotted for times longer
than 4.5 seconds, we would observe that (A) converges to its limiting
value near $S_{\max} = 22.4$ dB.

Fig. 17 — Comparison of the results of a computer simulation of an echo
canceller for a speech input with those predicted by the equations.
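
The two power estimates used for Curves (B) and (C) differ only in the averaging interval. The sketch below shows that difference on a toy signal. The 8000-Hz sampling rate, the function names, and the test signal are assumptions chosen for illustration and do not come from the paper.

```python
def short_time_power(samples, frame_len):
    """Mean-square value of each successive non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def long_time_power(samples):
    """Mean-square value over the whole record."""
    return sum(x * x for x in samples) / len(samples)

fs_hz = 8000                       # assumed sampling rate
frame_len = round(0.050 * fs_hz)   # 50-ms frames, as used for Curve (C)

# Toy signal: one loud 50-ms frame followed by one quiet 50-ms frame.
loud = [1.0 if i % 2 == 0 else -1.0 for i in range(frame_len)]
quiet = [0.1 if i % 2 == 0 else -0.1 for i in range(frame_len)]
signal = loud + quiet

frame_powers = short_time_power(signal, frame_len)  # one value per 50-ms frame
average_power = long_time_power(signal)             # single long-time value
```

The long-time value sits between the loud and quiet frame powers, so a prediction based on it cannot follow the frame-to-frame changes that the short-time estimate tracks.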

Another typical case is shown in Fig. 18. A hard limiter was used in
the feedback loop of the canceller. The same segment of speech was
used here as in Fig. 17. The value of K was chosen to give a
settling time for random noise of 1 second. Note that the echo canceller
converged to within 1 dB of $S_{\max}$ in 3 to 4 seconds.

The effects of high noise levels are shown in Fig. 19. With the S/N
ratio computed to be -10 dB, we see that the echo canceller converged
in approximately 1.5 seconds to $S_{\max} = 11$ dB, after which the
suppression tended to vary around this value. This demonstrates that the
canceller converges to $S_{\max}$ as predicted in the presence of high
levels of noise. Such a strong noise simulates the effect of
double-talking. Had the
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Fig. 18— The same as Fig. 17 with KTa x 2 = 3.5 x 10" r '. 
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Fig. 19— The same as Fig. 17 with S/N = —10 dB. 

noise been introduced after the canceller had converged to some higher
$S_{\max}$, the canceller would begin to diverge toward the limiting
value of S = 11 dB. The rate of divergence with speech double-talking is
slower than that with random noise. Thus the equations predict a more
severe effect of double-talking than actually occurs. Also, such an
interfering signal causes frequency selective divergence of the echo
canceller's transfer function. The divergence is greatest where the
interfering signal spectrum is largest. This is not necessarily where
the input signal spectrum is largest.

In summary, we see that a long-time estimate of speech power can
be used in the equations to give a good estimate of the limiting value
of suppression $S_{\max}$. Also, we see that the canceller performs
generally as predicted with a speech input and in the presence of strong
interfering noise. In general, the settling time of the canceller is
longer than that predicted by the equations. The settling time may be
reduced by increasing K, but this must be weighed against the resulting
decrease in $S_{\max}$. Also, if K is made too large, convergence may not
take place at all.
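
The trade-off between settling time and limiting suppression can be illustrated with a small discrete-time simulation. This is a sketch under stated assumptions rather than the configuration analyzed in this paper: the filters are taken to be a tapped delay line, the echo path taps and noise level are hypothetical, K is used directly as the per-sample loop gain, and suppression is measured as echo-path energy over misfit energy.

```python
import math
import random

def simulate(K, n_samples=6000, noise_std=0.1, seed=1, hard_limit=False):
    """Run a tapped-delay-line canceller; return suppression in dB per sample."""
    rng = random.Random(seed)
    echo_path = [0.6, -0.4, 0.3, -0.2, 0.15, -0.1, 0.05, 0.02]  # hypothetical
    m = len(echo_path)
    weights = [0.0] * m
    delay_line = [0.0] * m
    path_energy = sum(h * h for h in echo_path)
    history = []
    for _ in range(n_samples):
        # Shift a new random input sample into the delay line.
        delay_line = [rng.gauss(0.0, 1.0)] + delay_line[:-1]
        echo = sum(h * x for h, x in zip(echo_path, delay_line))
        echo += rng.gauss(0.0, noise_std)            # interfering noise
        estimate = sum(w * x for w, x in zip(weights, delay_line))
        error = echo - estimate                      # cancelled echo
        # F[.] is either linear or a hard limiter (sign of the error).
        drive = math.copysign(1.0, error) if hard_limit else error
        weights = [w + K * drive * x for w, x in zip(weights, delay_line)]
        misfit = sum((h - w) ** 2 for h, w in zip(echo_path, weights))
        history.append(10.0 * math.log10(path_energy / misfit))
    return history

def tail_mean(history, n_tail=1000):
    """Average suppression over the last n_tail samples."""
    return sum(history[-n_tail:]) / n_tail

slow = simulate(K=0.002)   # small gain: slow convergence, higher limit
fast = simulate(K=0.02)    # large gain: fast convergence, lower limit
```

In this sketch the larger gain reaches its limit within a few hundred samples, while the smaller gain takes much longer but settles at a higher suppression. Passing `hard_limit=True` replaces the linear feedback with the sign of the error. The gain then needs rescaling, since the update no longer depends on the error magnitude.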



IV. SUMMARY 

We have described the performance of an adaptive echo canceller
operating in a linear, time-invariant, noisy environment. Both digital
and analog implementations were considered. In both cases the echo
cancellers were assumed to consist of a set of M filters having
orthonormal impulse responses which were selected as the first M members
of a complete basis set. A weighted sum of the filter outputs
approximates the echo signal. The approximation is subtracted from the
real echo and the difference signal is used to continually improve the
approximation so that the cancelled echo power tends toward a minimum.
We have used the mean-squared value of the difference between
the transfer functions of the echo canceller and the echo path,
uniformly weighted over the bandwidth of the input signal, as a measure
of suppression. This measure is not necessarily equivalent to the
subjective echo loss (apparent loss perceived by listeners) other than
on a relative basis. Sets of equations were derived giving maximum
achievable suppression and settling time of the echo canceller. We
have shown that despite certain simplifying assumptions made in their
derivations, the equations accurately describe the performance for a
random noise input.

Families of curves (Figs. 5 through 7, 9 through 11, 13, and 14) show
maximum suppression and average settling time for a range of
incompleteness factors J, S/N, and a factor related to the input power.
The results of simulations are shown to be in close agreement with the
predictions.

For a speech input we have found that the equations for maximum
suppression can be used to predict performance, provided that the
long-time (several seconds) average speech power is used in the
equations. The short-time variability of speech power and spectral
variations of the speech signal cause the settling time of the echo
canceller to be longer than that given by the equations. We have found
that during convergence a speech input causes the transfer function of
an echo canceller to converge on a frequency selective basis, the fit
being best where the power spectrum of the input is greatest. We find
that, given enough time, the transfer function of the echo canceller
converges to essentially a uniform fit of the echo path transfer
function over the bandwidth of the input signal. We have also found that
an interfering speech signal (such as exists during double-talking) will
cause the echo canceller to diverge on a frequency selective basis. The
rate of divergence with speech interference is less than that for
random noise interference.
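
Frequency selective convergence can be reproduced with a non-white input. The sketch below substitutes low-pass-coloured random noise for speech, which is a deliberate simplification. The tapped-delay-line structure, the echo path taps, the gain K, the 8000-Hz sampling rate, and the band-suppression measure are likewise assumptions made for illustration.

```python
import cmath
import math
import random

def band_energy(taps, f_lo, f_hi, fs_hz=8000.0, points=32):
    """Average |H(f)|^2 of an FIR response over the band [f_lo, f_hi]."""
    total = 0.0
    for i in range(points):
        f = f_lo + (i + 0.5) * (f_hi - f_lo) / points
        w = 2.0 * math.pi * f / fs_hz
        total += abs(sum(h * cmath.exp(-1j * w * n) for n, h in enumerate(taps))) ** 2
    return total / points

def band_suppression_db(echo_path, weights, f_lo, f_hi):
    """Echo-path energy over misfit energy in one band, in dB."""
    misfit = [h - w for h, w in zip(echo_path, weights)]
    return 10.0 * math.log10(band_energy(echo_path, f_lo, f_hi)
                             / band_energy(misfit, f_lo, f_hi))

rng = random.Random(7)
m = 16
echo_path = [0.0] * m
echo_path[3], echo_path[6] = 1.0, -0.3   # hypothetical echo path
weights = [0.0] * m
delay_line = [0.0] * m
K = 0.01                                 # assumed loop gain
previous = 0.0
for _ in range(1000):
    white = rng.gauss(0.0, 1.0)
    sample = white + 0.9 * previous      # colouring: most power at low frequencies
    previous = white
    delay_line = [sample] + delay_line[:-1]
    # Noise-free cancelled echo: true echo minus the canceller's estimate.
    error = sum((h - w) * x for h, w, x in zip(echo_path, weights, delay_line))
    weights = [w + K * error * x for w, x in zip(weights, delay_line)]

low_db = band_suppression_db(echo_path, weights, 0.0, 1000.0)
high_db = band_suppression_db(echo_path, weights, 3000.0, 4000.0)
```

After the same number of iterations, the suppression in the 0- to 1000-Hz band, where the input power is concentrated, is well above that in the 3000- to 4000-Hz band. Given enough further iterations the high band catches up, consistent with the eventual uniform fit described above.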

Before concluding, two final points should be reemphasized. All
the previous analysis is valid only when the environment is linear and
time-invariant. At present, we suspect that certain systems
(compandored systems, for example) exhibit non-negligible nonlinearities.
For such systems, the previous analysis may not suffice, depending on
the type and magnitude of the nonlinearities. In any case, when
compandor-type nonlinearities are present, it becomes extremely
dangerous to attempt to predict the performance of an echo canceller
for a speech input from the white noise equations given previously.

Also, it should be stressed that the measures of performance we
have defined are objective in nature. These measures are not
necessarily equivalent to the subjective echo reduction which a listener
will perceive. A need exists to relate the objective and subjective
measures.
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